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ABSTRACT 

We present simple formulae for calculating the skewness and kurtosis of the aperture mass 
statistic for weak lensing surveys which is insensitive to masking effects of survey geometry 
or variable survey depth. The calculations are the higher order analogs of the formula given 
by Schnei der, van Waerbeke & Mellierl d2Q02l) which has been used to compute the variance 
of the aperture mass from several lensing surveys. As our formula requires the three-point 
shear correlation function, we also present an efficient tree-based algorithm for measuring it. 
We show how our algorithm would scale in computing time and memory usage for future 
le nsing surveys. We also apply the procedure to our CTIO survey data, originally described 
in I.Tarvis ef alJ fe003ll. We find that the skewness is positive (inconsistent with zero) at the 2a 
level. However, the signal is too noisy from this data to usefully constrain cosmology. 
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1 INTRODUCTION 

One of the most useful statistics for probing the pow er spectrum with weak gravitational lensing is the aperture mass statistic, M ap . The 
aperture mass was first proposed by Kaiser 1 1995) and Schneider 1 1996) for estimating the masses of clusters, and is essentially an estimate 
of the convergence, k, within a circular aperture. The convergence is proportional to the projected mass density relative to the average value 
in the field, hence the name aperture mass. 

Since the aperture mass is relative to the average value, the statistic (M ap ) gives over the whole field. However, higher order moments 
of the aperture mass are non-zero, and the var iance, (Mj), has proved very useful for weak lensing researchers, with many studies to 
date i ncluding this statistic in their analysis (e.g. lHoekstra. Yee & Gladdersl hooa l^n" Waerbeke et al. 2002; Jarvis et al. 2003; Hamana et al. 
2003). This statistic is a good probe of the power spectrum, and thus measures the extent of the Gaussian fluctuations of the density of the 
universe. 

However, any non-Gaussian component is not measured by the variance. Further, the density fluctuations must be non-Gaussian, since 
the density contrast is constrained to be greater than -1, but it can be arbitrarily large in the positive direction. Clusters generally have density 
contrasts of more than 200. Thus, the density fluctuations, which have a mean of by definition, must be skewed towards po sitive values. In- 
deed, this non-Gaussi anity has recently been detected in the VIRMOS-DESCART survey by two groups iBernardeau. Mellier & van Waerbeke] 
l200aben et all2003h . 

The lowest order measures of the non-Gaussianity using the shear field are called three-point statistics, since they require measurements 
of three shear values and their relative positions and orientations. Three-point statistics have t he potential to be very useful for c osmology, 
since they can, in combination with two-p oint statistics, determine f2 m and ag independently iBernardeau. van Waerbeke & MellieJI 19971 
I.Tain & SeliakHl997tlSchneider et alJll998l) . Th ey can also be used to constrain the properties of halos such as the inner slope and typical 
concentration parameters iTakada & Jain 2003). In this paper, we investigate a particular three-point statistic, the skewness of the aperture 
mass, (MfJ. ( 

^) proposed the use of the aperture mass for cosmic shear measurements, they introduced the form of the 
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Schneider et alj 



statistic which has been used by almost all weak lensing studies to date. However, we find that calculations of the three-point statistic is 
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significantly easier using the form introduced bv lcrittenden et af] i2002h . Namely, 

M ap ( J R) = / d 2 rU R (r)K{r) (1) 
d 2 rQ R {r) lt {r) (2) 



U *v = ^{ l -m)^{-m) (3) 

Qn{r) = -U R (r) + !■ / r'dr'Z7«(r') (4) 
r Jo 

expf-il (5) 



4jrB* *\ 

where k is the converg ence, 7 t is the tan gential component of the shear, and r is measured from the centre of the aperture. 

We also note that lzhang et alJ 120031) has shown that this form of the aperture mass is more sensitive for constraining Q. m than other 
forms. 

There are several features of the ape rture mass s tatistic which makes it very useful. First, it probes the power spectrum with a very 
narrow window function, VFfn^. lschneider et alJll998l) derive the relation: 

(Mi p )(R) = i / d 2 kP(k)W{kR) (6) 

W{kR) = U R (k) 2 (7) 
-k 2 R 2 ( k 2 R 



2 



■ exp 



(8) 



W(r,) = ^exp(-r ] 2 ) (9) 

where the tilde (~) indicates the Fourier transform. This window function W(rf) oc 77 4 for small 77, and drops super-exponentially for large 77, 
peaking at 77 = 2. 

Another benefit of the aperture mass is that it has (nearly) finite support in real space, so it is calculable. The ideal window function 
would be a delta function, W(rf) — £0(77 — 770), so the statistic would directly probe P(r)o/R). However, to have infinitesimal extent in 
k-space would require infinite extent in real space, and thus be incalculable. (M 2 p ) thus has the advantage that it is compact in real space as 
well 1 . 

The third benefit is that it measures purely the so-called E-mode of the shear field. There is a corresponding statistic, M x , which 
measures the B-mode: 

M X (R) = J d 2 rQ R (x)~{ x (r), (10) 

where the cross-component of the shear, 7 X , is oriented at an angle of 45 degrees relative to 74. Then the variance of this measure, (M x } is 
a measure of the B-mode power in the shear field, which is generally taken to be a measure of the co ntamination from residual systematics, 
such as uncorrected effects of the point-spread functions. Intrinsic alignments Icrittenden et all2'() 02) and source clustering ISchneider et alJ 
2002) can also produce B-mode signal, but the level of both of these is generall expected to be very low for scales larger than 1'. 

The only problem with calculating these statistics directly is that real lensing surveys have regions which are masked out due to survey 
geometry, bright stars, bad seeing, etc. Thus the aperture mass statistic runs the risk of losing azimuthal symmetry due to the masking. Since 
the statistic depends on azimuthal symmetry for the angular integrals, a direct calculation of {M 2 p ) and (M x ) will leak some of the E and 
B-mode power into the other statistic. This may only be of order a 10 per cent effect or less for typical surveys, but with the goal of precision 
cosmo logy, another method of calculation is typically used, first derived by Icrittenden et alJ 120021) and then refined bv ISchneider et all 



(2002 

They have shown that the aperture mass variance can be calculated from an integral of the correlation functions in an equation of the 
form: 

WiR) = y§[Ms)T + (l) + ^s)T.(±)] (11) 

<^>(*) = i/f [M-)T + (£)-M<)T_(£)] (12) 

where £+,- are the two-point shear correlation functions (defined more precisely in |5J a nd T+._ are kno wn functions (also defined in [J2j- 
The third moment of the aperture mass, (MjL,) was originally suggested bv lSchneider et all j 19981) as a statistic for investigating the 
non-Gaussianity of the shear field. It is useful as a cosmological probe, since (as these authors showed) its dependence on Q m and erg are 



Technically, the aperture mass does have infinite extent as well; however, the exponential cutoff is so sharp that the window function is effectively finite for 
real calculations. 
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somewhat orthogonal to that of (Mf p ), so the combination of the two statistics can be used to determine fi TO and erg separately. In particular 
{M't p ) /{Mip} scales approximately as l/fl m , independent of erg. 

However, a direct measurement of (Mf p ) using apertures would be even more affected by masking than the variance is. Thus, in this 
paper, we derive formulae similar to Equations 1 1 1 1 1 2i for (Mf p ) and (M* } in terms of the three-point correlation function. 

The signal-to-noise (S/N) for (M^ p ) is significantly lower than for (M 2 p ). As we will show bel ow, our total S/N f or (Mf p ) is of order 
unity for the roughly 10 6 galaxies in our CTIO survey, compared to a S/N of about 7 for (Mf p ) ijarvis et"alll2003h . As the S/N scales 
as Ng[\, we would need about 10 8 galaxies to make a good measurement of (Af| p ). Planned surveys such as those from the Supernova 
Anisotropy Probe (SNAP) and the Large-aperture Synoptic Survey Telescope (LSST) will do just that. Actually, the S/N also increases when 
one goes deeper, which both of these telescopes will do, so their S/N should be somewhat better than 10. In any case, the large number of 
galaxies from these future missions demands efficient algorithms for calculating (Mf p ), such as the one presented herein. 

In J3| we define our notations and briefly rederive the above equations < I 11121 in a manner that will lend itself to generalizat ion to the 
three-point version. This is important, since we found that the derivations given byjgri^tendene t all 120021) and lSchneider et all I2OO2I) do 
not generalize easily to the three-point case. The method is fairly similar to that of Pen et al. 1 2003), although they use a tensor formulation, 
which is a bit unwieldy, having 64 components for the three-point correlation function (8 of which are unique). Then Sj5|will derive the 
corresponding equations for the three-point statistic, ^describes our algorithm for calculating the three-point correlation function. In Sj5] we 
then apply the formulae to simulated data to check their validity, and also to our CTIO survey data. 



2 VARIANCE OF THE APERTURE MASS 

We follow the notation o f lSchneider etai]l2002t) to describe the E and B-mode components of the shear field by defining the lensing potential, 
ip, to be complex. 

iP = il} E + iip B (13) 
The convergence, k, and shear, 7, are related to the potential by 



1 ( a . a , 



(14) 



(15) 



Since the convergence is the projected matter density, it is real for pure lensing fields. Thus, lensing produces only E-mode fields, and the 
B-mode is generally used to check for residual systematics or the aforementioned effects of intrinsic alignments or sourse clustering. 

With a complex k, Equation Q would give us a complex value for M ap . However, using Equation 1101 . and defining M = M ap + iM x , 
we find 



M{R) = J d 2 rU R (r) K (r) 



d rQ R (r)j(r)e 



-2i<j> 



where <j> is the polar angle of r. 

Using this definition, the expectation value of M 2 is 

(M 2 )(R) = J d 2 n J d 2 r 2 Q Ji (r 1 )Q Ji (r 2 )(7(r 1 )7(r 2 )) e - 2!Wl+02) 

We now define the 'natural components' of the two-point shear correlation function as: 

= (7(r)7(r + s)e~ ila ) 

£+00 = <7(r)7*(r + «)) 
where a is the polar angle of s. 

Then, Equation 1 1 8i becomes (taking r 2 = v\ + s, and dropping the subscript for n) 



{M 2 )(R)= J d 2 r J d 2 sQ R {r)Q R (\r + s\)!i-{s), 



2 i( 2 a-0 1 -^ 2 ) 



To evaluate this, we treat r and s as complex numbers, so s — s exp(ia), r = r exp(ic6i), and r + s — \r + s\ exp(i</> 2 ). Then 



(M 2 )(R)= / d 2 s£_(s)— / d 2 rQ R ( ri )Q R (\r + s\) 



(r* + s*) 2 r* 



= / i?£-oo 



sds . 



128i? 4 



exp 



-—) 
AR 2 I 



R 2 



R 



(16) 
(17) 

(18) 

(19) 
(20) 

(21) 

(22) 
(23) 
(24) 
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where * indicates complex conjugate. (The evaluation of the integral in Equation 1221 is straightforward in Cartesian coordinates, where each 
complex value is represented as, for example, r — x + iy.) 
A similar procedure can be used to evaluate {MM*): 



{MM*)(R) = / d 2 r / d 2 sQ R (r)Q R {\r + sj)( 7 (r) 7 *(r + s))e 



(r + s) 2 r* 2 



-2i(0i-<£ 2 ) 



d s£+(s) / d rQ R (r)Q R (\r + s\ 



sds 



16R 2 s 2 + 32R 4 



128i? 4 



■ exp 



r + s\ 2 r 2 
AR 2 ) 



sds 



Finally we can convert these expressions into formulae for {M 2 p ) and (Mj) individually through the expressions: 

<M a 2 p ) +i(M ap M x > = ~ ((MM*) + (M 2 » 
<Af 2 } _ i<M ap M x ) = i ((MM*) - (M 2 )) 



So, 

(M a 2 p )(i?) + i(Af ap M x )(7?) 



sds 



(M 2 )(R) - i{M &p M x )(R) = \j sds (±) - f_( S )T_ ( 



s 

-) 



where 



- 16a; 2 + 32 



128 



128 



exp 



■ exp 



(25) 
(26) 
(27) 
(28) 

(29) 
(30) 

(31) 
(32) 

(33) 
(34) 



This result agree s with that obtained from the formulae in lSchneider et al as well as the formulae in lcrittenden et al] 12002 

Also. |Penetaill2003h gives a formula for a tensor form of Equations|^and|32| which can be expanded to agree with our results. 



3 THIRD MOMENT OF THE APERTURE MASS 

We now follow the same procedure as in j5|for evaluating (M^) and (M 2 M*). 



(M 3 )(R) = -jd 2 rj d 2 s J d 2 tQ R (r)Q R (\r + s\)Q R (\r + t|)( 7 (r) 7 (r + s) 7 (r + t )) e - M <*+*»+*») (35) 
For the 'natural components' of the three-point correlation function, we follow lSchneider & Lombard] |2002 ) and define: 

r (s, t') = ( 7 (r) 7 (r + s) 7 (r + t) c - ai(o,I+0,a+aa) ) (36) 

Tifat') = ( 7 *(r) 7 (r + s) 7 (r +t)e- 2l( - Q1+Q2+a3) ) (37) 

T 2 (s,t') = ( 7 (r) 7 *(r + s) 7 (r +t)e- 2l(ai - a2+Q3) ) (38) 

r 8 (*,f) = ( 7 (r) 7 (r + s) 7 *(r +*)e- 2l(Q1+Q2 - Q3) ) (39) 



where t' — ts* / s. Since we have three points now, they form a triangle, which is depicted in Fig.Q The parameter t' is simply t when the 
triangle is oriented such that s is parallel to the x-axis. Since the correlation functions are rotationally invariant, they are properly a function 
of only three real variables, s uch as (s, t'), rather than four (s,t). 

ISchneider & Lombardil l2002h give several possible definitions for ctj. For our purposes, it does not much matter which definition one 
takes, as long as the en's correspond to directions which rotate with the triangle. The simplest example is for all three to be equal to the polar 
angle of s 3 . 

s = se la (40) 
Qi = OL2 = as = a (41) 



2 We note however, that the results quoted in Crittenden et al. 1 2002 ) for W and VV (which correspond to our and T_ , but for a slightly different 
normalization for U) are too low by a factor of 7r/2. 

3 For other definitions of these angles, a';, simply multiply the final formula for To by exp(2i(a' 1 + a' 2 + 03) — 6ia) and T\ by exp(2i(— a' ± + a' 2 + 
a' 3 )-2ia). 
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*1 S ^2 

Figure 1. Graphical representation of the q parameters used in the formulae for Tq and T\. These vectors are used as complex numbers in the formulae. 



Then, with these definitions, 



(M 3 )(R) 



d> / dfs / d 2 tQ R {r)Q R (\r + s\)Q R (\r + t\)V (s,t')e 



d z s I d 2 tr (s,t')^r I d 2 rQ R (r)Q R (\r + s\)Q R (\r + t\) 



/\ 6ia — 2i(</> 1 +</> 2 +</>3) 



r* 2 (r* + s*) 2 (r* +£) 
r 2 \r + s\' 2 \r + t\' 2 



„*2„*2„*2 
1l 12 

24i? 6 



exp 



ql + 92 + gf 

27? 2 



# y 2^ r ° (s '* )To U'^ 

where are the vectors from each vertex to the centroid of the triangle 4 . They are depicted graphically in Fig.Q Algebraically, 

q x = (s + O/3 
q 2 = (t' - 2s)/3 
q 3 = (s- 2t')/3 
Likewise, 

/\2ia-2i(-0i +02 + 03) 



{M 2 M*)(R) = - I d 2 r I d 2 s I d 2 tQ R {r)Q R (\r + s\)Q R (\r +t\)T 1 (s,t')e 
(ft' 



/sds f 
~R? J 



2-kR 2 
sds f d 2 t' 



ri(a,t' 



If- J 2~ff2 ri(s '*' )Tl \ /.'■ />' 



24i? 6 



9R 4 



27R 2 



exp 



ql + ql + gg v 

27? 2 



The functions To and Ti are thus: 



r ( a ,t) = -^^^exi» 



2 1 2 1 2 

9l + <72 + 93 



Ti{s,t) 



24 "V 2 

gig2 2 gf _ giglgl , gi 2 + 2^293 

24 9 27 



exp 



<7i 2 + ql + ql 



(42) 
(43) 
(44) 
(45) 

(46) 
(47) 
(48) 



(49) 
(50) 

(51) 
(52) 



where s, t, and the q's are now dimensionless quantities. 

As an aside, we note that integral formulations for To and T\ are possible for any aperture function U R (r). However, the particular form 
we use here is the only one for which we have been able to calculate closed forms for To and Ti . In particular, the Gaussian exponential term 
makes the integration step from Equation 1431 to I44i (and the corresponding step for Xi) straightforward, although rather messy. 

Fig. [2] shows the absolute value of To and Ti for equilateral triangles as a function of the triangle side length. Note that the T are 



These are similar to the qi vectors used bv lPenetalJ<2003t) . Their g^'s are 3 times the values our qi's. 
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Figure 2. The absolute magnitude of the functions To and T\ for equilateral triangles as a function of the side length of the triangle, s. The solid green curve 
is | To | , and the dashed blue curve is | Ti | . 



significant for triangles with side lengths up to about 4R, so to measure (M 3 p )(R) up to R = 10', one would need a survey at least about 

40' on a side. Interestingly, this is about the same size as one needs for the corresponding measurement of (M ap )(_R). 
Finally, we use the relations: 

<M 3 > = (M a 3 p ) + 3i<M a 2 p M x > - 3(M, P M 2 ) - i(M x ) (53) 

(M 2 M*> = (M a 3 p ) + i(Mi p M x ) + {M^Ml) + i(M x ) (54) 

to obtain: 

(M'i p )(R) = i$R {3(M 2 M*)(R) + (M 3 )(R)) (55) 

(Mi p M x )(R) = 1q ((M 2 M*)(R) + (M 3 )(R)) (56) 

(M ap Ml)(R) = i$R ((M 2 M*)(R) - (M :i )(R)) (57) 

(M X )(R) = (3(M 2 M*)(R) - (M :i )(R)) (58) 



There is one further complication for using these formulae. In practice, one does not calculate Ti(s, t) for every (s,t), since this would 
include every triangle six times. A typical program would bin each triangle according to its shortest side, s, and its second shortest side, t, 
thus counting every triangle only once. 

In this case, let us assume that we know Ti(s, t) for all s, but for only those t where |t — s\ > t > s. Then naively, Equation J45> 
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becomes 

(M 3 )(R) 



sds 

l^ 2 

s<t'<\t'—s\ 

sds 



d 2 t' , ( s t' 

ro( S ,i)To(-,- 



s<\t' —s\<t' 



+ 



+ 



sds 

l^ 2 

sds 

l^ 2 

sds 

l^ 2 

sds 
"R 2 



t'<s<\t' — s\ 



t'<\t'— s\<s 



\t' — s\<s<t / 



2-kR 2 

d 2 t' 

2%R 2 

d 2 t' 
2-kR? 

dH' 
2%R 2 

dH' 

2-kR 2 ' 

2*' 



t' St'* 



To t 



, st" 



i?' t'R 



To t 



/ / 1' (t'-s)t' 



To - 



r If - s| 



s(t' - s)' 
It' -si 



i?' t'_R 

i'-s| -s(i'-s)* 



To 



R 



It' - sli? 



d 2 t 

2tvR 2 



To \t'-s\. 



fit! - s)*\ (\f -s\ t'(t'-s)* 



It' - s| 



To 



R 'It' - s\R 



\t'—a\<t'<3 

However, for each of these integrals, a change of variables transforms it into the first one. Thus, we have the much simpler result: 



<M 3 )(i?)=6 J 



sds 



1*1 



d z t 

2-kR 2 



U{s,t')To 



1 t 

R' R 



s<i / <|t / — s\ 

The case is almost the same for {M 2 M*}, except that the conjugated vertex changes in 4 of the integrals, so the net result is: 



<M 2 M*>(fl) = 2 J £| 



d 2 t' 



s t' 



R' R 



2-kR 2 

s<t'<\t'— s| 

where T2 and T3 are obtained from Tj by a cyclic rotation of the indices 



riMT! T 2 (s,t')T 2 -, - +r 3 (s,t')T 3 -, - 



s t' 



R' R 



s t' 



R R 



(59) 



(60) 



(61) 



The extension to the kurtosis of the aperture mass ((M ap )(i?)) is straightforward and is given in the appendix. 



4 EFFICIENT CALCULATION OF THE THREE-POINT CORRELATION FUNCTION 
4.1 Two-Point Algorithm 

We consider an efficient algorithm for calculating the two-point correlation functio n first. This pr oblem has been solved in a number of ways 
for the purposes of number count correlations for studying large-scale str ucture. iBertschingei) gives a review of some techniques 

for that purpose. The algorithm we present below is most similar to that of M oore et alJ yQOO). There are a few complications due to the 
fact t hat we are corr elating a vector field rather than a scalar number count, but the creation and use of the cells are quite analagous. Also, 
lzhang&PerJl2004l) have recently presented a similar algorithm for the purposes of weak lensing. Therefore, we present a brief description 
of the algorithm below in order to describe the unique features of our version of the algorithms, and we defer to these other two papers for 
more comprehensive descriptions of the general method. 

The most naive algorithm for calculating the two-point correlation function would be to take every pair of galaxies in the field and put 
the product of the shears into a bin corresponding to their separation. Then the average value could be found for each bin. Clearly, this is 
extremely slow for large datasets, being an 0(N 2 ) algorithm. The corresponding three-point algorithm is even worse, 0{N 3 ). 

However, we note that the binning inherent in even this brute-force technique is effectively smoothing the correlation function by the 
size of the bins, since we lose all information about where within each bin the pair really belongs. If we allow ourselves to smooth again on 
this scale, then we can greatly speed up the algorithm. 

First, binning is usually done in logarithmic bins in the separation, d, with some specified minimum and maximum scale, d m i n < d < 
dmax- Now, suppose we have two circular regions of radii si and S2, whose centres are separated by a distance d c . Then, then the separation 
for any pair of galaxies taken from these two regions must fall in the range d c — si — s 2 < d < d c + si + s 2 . If the radii are much less than 
d, this becomes 

ln(d c ) - £i±ii < ln(d) < ln(d c ) + (62) 

CLc Clc 

So, if our regions satisfy 

— j < (63) 

d c 





Figure 3. A sample calculation step in the two-point algorithm. At first the two Cells are too large compared to the distance between them, so they need to be 
split up. After the splits, there are four pairs of subcells to be considered. Three of the pairs (marked with green solid lines) satisfy Equation 1631 , and thus can 
be computed directly. The fourth (marked with a red dashed line) needs to be split again. 



where b = A(lnd) is the bin size, then we can put all of them into the same bin, knowing that we will be wrong by at most one bin for any 
specific pair of galaxies. 

In this case, we can take the average of the shear values in each region and place the product of the averages in our bin, dropping the 
calculation time for these galaxies from 0(NiN%) to 0(Ni + N%), where Ni and N2 are the number of galaxies in each region. When the 
N's are large, this can be a huge speedup. 

The s pecific technique for implementing this type o f algorithm is to create a so-called kd-tree (with k=2 in this case) for the data. Again, 
we defer to lMoore et all j.2000) and Zhana & Pen 1 2004) for a more complete description of this type of data structure. 

In short, we bin the data in 'Cells', keeping track of various average values for the galaxies in each Cell along with its centroid and size 
(maximum distance of a galaxy from the centroid). Each Cell with more than one galaxy (and larger than d m i n ) also contains pointers to two 
sub-Cells with half the number of galaxies each. 

When Equation 1631 is not satisfied, we split each Cell (roughly) in half and check the resulting pairs of regions. Cell splitting continues 
in this manner until all pairs can be calculated directly. Note that the recursion must stop at some point, since si + S2 = for two individual 
galaxies, so Equation I63i must be satisfied eventually. 

Fig.|5|shows an example of a splitting decision in the algorithm. At this point we are calculating the correlation between all the points in 
Cell 1 with all the points in Cell 2. In the top half of the figure, si + S2 is too large compared to d c , so we split both Cells into their subcells. 
There are now 4 pairs of subcells to consider. The three marked by the green solid line now satisfy Equation 1631 . and can be calculated 
directly. But the 1 A-2A pair marked by the red dashed line needs to be split again. 

We tested this algorithm on a 2.4 GHz Xeon processor for a square field 200 arcminutes on a side, and found the calculation time and 
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Figure 4. The geometry of the triangle in the discussion of the three-point algorithm. We use the slightly non-intuitive convention that d\ > d,2 > d 3 (rather 
than the other way around). This is so we agree with Fig.Qwith s as the smallest side, as desired for Equations l60l and loT1 



memory usage to be 5 : 

/ jy \ 1.0 / , \ -1.7 

Tw3 - 6 (jov I^tJ seconds (64) 

M ~ 19 \T¥j MB (65) 

Even for N ~ 10 s , which will be required for surveys like those of the Supernova Anisotropy Probe (SNAP) and the Large-aperture Synoptic 
Survey Telescope (LSST), this algorithm still only takes about an hour and uses less than 20 GB of memory, which we expect will be common 
on destop computers by the time such surveys become available. Therefore, we do not foresee any need to improve upon this algorithm in 
the near future. 



4.2 Three-Point Algorithm 

For the three-point function, the triangles are parametrized by three values (since we ignore any overall rotation or translation). One possible 
parametrization would be to use the three side lengths: di > > d 3 . This is not a good choice for two reasons. First, the orientation of the 
points is lost - a non-isocolese triangle is not distinguished from its mirror image. Second, the range of the parameters is a function of the 
other parameter values (e.g. d\ — d2 < d 3 < dz). 

A better choice is the following for a triangle with d\ > d 2 > d 3 : 

r = ds r min < r < r max (66) 

u = d 3 /d 2 < u < 1 (67) 

v = ±(di - d 2 )/d 3 -Kv<l (68) 

where v is positive when (x\ , X2,x 3 ) are oriented counter-clockwise and negative when clockwise. 

We bin uniformly in ln(r), it and v with a bin size b. Now, if we have three circular regions of radii si, S2, S3, whose centres make a 
triangle with di > d 2 > d 3 (see Fig.|4j, we can use the average shear in each when all of the following conditions hold: 

Sl + S2 



d 3 



< b (69) 

f < (70) 

d 2 u 

f < ~n==, (71) 

d2 Vl — v 2 

When considering three Cells for calculating the correlation function, if the sizes satisfy the above criteria, we use the average shears 
for each and place the product into the bin corresponding to the triangle formed by their centres. Otherwise, we recurse down to the sub-Cells 
of (some or all of) the Cells and try again. 



Technically, there must also be a ln(iV) factor in the memory, and possibly the time, but this factor is essentially a constant over the range of N we tested, 
so we neglect it. 




Figure 5. The geometry of the PairCell's for the improved three-point algorithm. The pairs on the left side of the figure all have midpoints which are near each 
other, and have cells at roughly the same separation. Thus, they all fall into a single PairCell. The dots represent the midpoints of each pair. The size of the 
PairCell, S\2, is based on the scatter of these midpoints. The distance from the PairCell to another Cell is then the median of the corresponding triangles, c£ m . 



This algorithm is far faster than the naive brute-force approach of taking every triple of galaxies which would be 0(N 3 ). For the same 
field and processor mentioned above, we find the computation time and memory usage for our implementation of this algorithm to be: 

/ at \ 1-4 / 7 \ —3.3 

t ~ 415 (icf) (or) minut es (72) 

N x 1 '° 



M « 22 l^—J MB (73) 

IZhang & Per] f20Q4) show some empirical computation times for their algorithm as well, which are shown in their fig. 3. Although their 
tests were made at somewhat lower values of N than our tests, we believe that their scaling law at the same fiducial values we quote above 
is likely to be similar to ours. 

The algorithm takes the most time dealing with triangles which are a significant fraction of the size of the field. So for very large fields, 
when one is only interested in the correlation on small scales, the calculation will be somewhat faster than this, as the larger triangles can be 
ignored. 



4.3 A Faster Algorithm 

We have discovered a way to significantly improve upon this algorithm, at least for large values of N. However, it can require a significant 
amount of memory, so depending on one's computing resources and survey size, it may not always be a viable option. 

In this algorithm, we proceed one bin at a time in ln(r) rather than doing all at once as we did in the above algorithm. For each bin, we 
find all pairs of Cells with (si + S2)/ds < b where d% falls in the bin in question. Then, we find the midpoint Al of each pair, and create a 
new kd-tree for these pairs using these M's for the positions of each pair. (See Fig.|5|) 

Then, each 'PairCell' refers to a collection of pairs. For each, we keep track of the mean M position, and the scatter of the M's, which 
we call Si2. We don't average the data for all of the pairs in the bin, since the triangles made from these pairs and a third point can still 
have very different shapes depending on the orientation of the pair. But two pair with similar orientations will go into the same r, u, v bin. 
Therefore, within each PairCell, we bin the pairs according to their orientation, using a total of 4/b bins in order to maintain our rule of being 
off by at most one bin in u and v. 

Finally we correlate the PairCell data with the third points of the triangles by traversing the tree of pairs and the original tree of Cells 
described above. For a given PairCell with size s\2 and a normal Cell with size S3, separated by a distance d m 6 , we split one or both unless 

^±ii < b (74) 

Note that for each PairCell and Cell which satisfy this requirement, the various pair orientations stored in the PairCell span the entire 
range of — 1 < v < 1. Also, if dz/d m is not very small, we need to calculate u separately for each orientation as well, since u can vary by 
several bin sizes. But for each individual orientation, the above equation guarantees that we are only smoothing by one bin in u or v. 

When one typically has many pairs in each PairCell at the point where Equation <74t holds, then this algorithm would be expected to 



6 The m stands for median, since this line is the median of the triangle from vertex 3 
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be significantly faster than our previous algorithm, which effectively does each pair individually. This is usually true for the large triangles 
where the algorithm spends the most time, so we find that this algorithm does turn out to be significantly faster than our previous one. 
Empirically, the calculation time and memory usage for this algorithm are found to be: 

T ~ 161 (i^) " (err) " mmutes (75) 

"*«»(io0 Ui) MB (76) 

This is a factor of 2.6 faster than the previous algorithm for our fiducial values, and the speedup increases with N, since it scales as a smaller 
power of N than our previous algorithm. 

However, for fields with N > 10 6 , the memory demands become difficult for a desktop machine (although still feasible with modem 
supercomputers). For the SNAP or LSST surveys with N « 10 8 , this algorithm will require almost a terabyte of memory. This would be a 
lot by today's standards, but when the data from these surveys eventually become available, it will probably be quite manageable. Likewise, 
this algorithm would take a good fraction of a year for the desktop machine described above, but in five or ten years, it will must faster. In 
any case, this algorithm is extrapolated to be about 20 times faster than the previous algorithm. 

If anyone is interested in the code for any of these algorithms or would like to use our code on a survey dataset, we would be happy to 
provide it for you. If interested, please send an email to Mike Jarvis (mjarvis@hep.upenn.edu). 



5 APPLICATION TO SIMULATED AND OBSERVED DATA 
5.1 Simulated Shear 

To ve rify that the above equations are correct, we first apply them to a simulated shear field produced by ray tracing simulations [fain. Seliak & Whit! 
l2000h . The simulated data uniformly cover an area 200' on a side, and have no shape noise on the shear values. The particular simulation we 
used was for ACDM with fi m = 0.3, A = 0.7, T = 0.21 and <r 8 = 0.9, with sources at z = 1. 

We measure (Mf p ) both from the correlation functions as described above and also by measuring M ap for many apertures using 
Equation {2} and then computing (Mf p ) directly. This second technique is not possible on real data, due to masking effects from bright stars 
or complicated survey geometry, as discussed in §\\ but for the simulated data, there is no masking, so the direct computation is possible. 

The results from the two methods are shown in Fig. [6] The results from the direct aperture method (Equation J5J) are the dashed green 
curve. The results from the integration method (^3} using b = 0. 1 are the solid blue curve. The error bars for the direct method were estimated 
from the distribution of M ap values used to compute the average. For the integration technique, we took 12 samples of 100,000 points drawn 
from the gridded simulation field, and computed the error bars from the actual 'field-to-field' variance. The sampling is needed to prevent 
the gridding signature of the simulation from showing up in the correlation functions and affecting the results. 

The two methods are seen to agree to within the error bars over most of the range, so we believe the formulae in *0| to be accurate. The 
largest discrepency is at scales just above 10' where the direct measurement abruptly falls off. We think that this is due to there being too few 
apertures in the field to accurately calculate the skewness. At R = 10', Ur(v) is significant out to 20', so one can only fit 5 non-overlapping 
apertures of this size within the 200' square. We do use overlapping apertures for our calculation, since the measurements should still be 
nearly independent when the apertures overlap somewhat, but we believe that the relatively small number of independent apertures is likely 
to be the source of the discrepency. 

The figure also shows the mixed and B-modes measured by the integration technique: (M% p M x } is the cyan dotted curve, 
is the magenta long dashed curve, and (M* } is the red dot-dashed curve. The plotted values are the absolute values of the measurements, 
and the largest of these ((M ap Mx }) is roughly two orders of magnitude below the E-mode signal. This is due to the use of b — 0.1; using a 
smaller value would reduce the leakage of power from the E-mode to the others still further. 



5.2 CTIO Survey Data 

We next apply the above formulae to our 75 square degree CTIO ( Cerro To lolo Interamerican Observatory) survey. A complete description 
of the survey data and the processing techniques used is found in ljarvis et alJ i2003h . The 75 total square degrees are divided among 12 
fields, each roug hly 2.5 degrees square. Each field has of order 150,000 usable galaxies. We have also corrected the error pointed out by 
Hirata & Seliak (2003) in our dillution correction formula. The data presented here use their linear approximation for this correction, which 
they find to be significantly more accurate than our formula. 

In our previous paper, we measured the (M ap ) statistic and found a cle ar detection of the E-mode signal, but found significant B-mode 
as well. We had used the aperture mass definition of lSchneider et alJ I2OO2I) rather than the one defined herein (Equation Q), so we present 
the results for this definition in Fig. [7] The E-mode ((M£ p )) is shown in blue, and the B-mode ((Mx }) in red. The m easurements shown ar e 
spaced by approximately a factor of 2 in R, since this is where the measurements become independent of each other. Ischneider et all2 002). 
Note that the error bars are calculated from field-to-field variation, so they accurately represent both statistical and sample variance. 

Our calculation of the (M^ p ) statistic, as well as the mixed and B-mode statistics, are shown in Fig.|8|for the range of V < R < 20'. 
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Figure 6. A comparison between a direct calculation of (Mf p ) and the integration technique presented in this paper for a simulated shear field with no shape 
noise or masking. The direct calculation is the green dashed curve, and the integration technique is the blue solid curve. The three other curves near (and often 
below) the bottom of the plot are the absolute values of the mixed and B-modes measured by the integration technique. 



Less than 1', the B-mode contamination in the two-point measurement became large compared to the signal. And greater than 20', we do not 
fully trust the measurement, since this is where the integration and direct measurements differed for the simulated field described above. 

With the integration technique, we can calculate a value for (Mf p ) at any value of R. However, as with the variance, we plot them 
spaced by a factor of 2 in R, since this is when the points become uncorrelated. Also, the error bars are again calculated from the field-to- 
field variation, so they inlude sample variance. We omit the error bars for the mixed and B-modes for the sake of clarity, but their size is 
similar to those of the E-mode errors. 

Unfortunately, the detection is fairly marginal. The best conclusion we can draw is that (Mf p ) is roughly the right order of magnitude 
to be consistent with a concordance ACDM model, given as the solid black curve in Fig. 00 ( We scal ed the measurements from the above 
simulation to as = 0.8 and to z 3 = 0.6 based on the theoretical scaling found bv lTakada & Jaii] j2002h .) However, the S/N is quite low, and 
the mixed and B-modes indicate that there are potentially significant systematic errors contaminating the results. Thus, we do not try to use 
this result to constrain any cosmological models. 

One test we are willing to make with this data is to test whether it is consistent with zero. The statistic we use for this test is: 

r _ 1 v {Mj p )(R t ) 

Over the range plotted, we find that P = 1.09±0.48 7 , which is a detection at the 2.3<r level. The other three modes give values of 0.31, 0.14 
and -0.13, all within la of zero. So, we do have a mildly significant detection that the skewness is positive, but we cannot make any stronger 
claim than that. 



7 The variance of P is 1/N. Since our integration technique gives measurements essentially continuously within the range from 1' to 20', we calculate P 
using all of these points. For the uncertainty, we use the effective number of independent points: N = ln(20)/ ln(2) = 4.3. 
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Figure 7. The variance of the aperture mass statistic using the integration technique for our CTIO 75 square degree survey. The E-mode component is blue, 
and the B-mode component is red. As discussed in Jarvis et al. i2003), there is evidently significant B-mode contamination at scales less than 10'. The black 
curve is the theoretical prediction for the concordance ACDM model. 



6 CONCLUSION 

We have presented a new calculation for the skewness of the aperture mass as an integral over the more easily measured three-point correlation 
function. We have also presented efficient algorithms for calculating the three-point correlation function in real data. These algorithms are 
believed to be faster than other published algorithms. Finally, we have applied these methods to our CTIO survey data, for which we do not 
obtain a strong detection. The signal is consistent with the concordance model prediction, but is only inconsistent with zero at the 2a level. 

However, we expect that this method will be useful for ongoing and future surveys which are deeper and wider than ours, and will 
therefore be better able to detect the sig nal. |Penetal1l2003l) have already used a similar technique on the VIRMOS/DESCART survey, and 
have measured an E-mode signal which is significantly greater than the mixed and B-mode contaminations (at least for some values of R). 
They find f2 m < 0.5 at 90 per cent confidence. 

There are several ongoing and proposed surveys which will substantially increase the S/N in these measurements. For example, 
the Canada-France Legacy Survey (CFLS) will cover 170 square degrees at a depth of r < 25.7. This is a similar depth to the VIR- 
MOS/DESCART survey, but covers about 20 times more area, so the S/N would be expected to increase by a factor of 4. 

The CFLS will observe three disjoint fields, two of 49 square degrees and one of 72 square degrees, with an estimated number of 
galaxies of roughly 600,000 and 900,000 respectively. Thus, the total calculation time for our fastest three-point algorithm will be about 3 
days and require a maximum of about 7 GB of memory on a modern desktop computer, which is quite feasible. 

In fact, this would calculate the correlation function for the entire range of scales available (up to 7 or 8 degrees), which is not necessary. 
Limiting the calculation to scales less than about 200 arcmin, where the signal is strongest, speeds up the calculation significantly. In the 
current implementation, this would not reduce the algorithm's memory demand, but one could easily modify the algorithm to keep only a 
fraction of the total galaxies in memory at a time which would be useful for surveys such as this one with large contiguous fields. 

Improvements in the reduction techniques are probably just as important as increasing the number of galaxies, since the B-mode 
contamination is evidence that there are still some systematic errors in the data. So far all surveys who have checked for B-modes in the data 
show this contamination at a significant level, although there are often scales at which the B-mode is small compared to the E-mode signal. 
Reducing these systematic effects will probably require both better PSF-removal algorithms as well as cleaner raw images. 
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Figure 8. The skewness of the aperture mass statistic using the integration technique for our CTIO 75 square degree survey. The E-mode ((AfJL)) is shown in 
blue, with error bars indicating the field-to-field variance of the measurement. The mixed and B-modes are the circles ((M 2 p M x )), triangles ((M ap M^ )), 
and crosses ((M 2 )). The black curve is the theoretical prediction for the concordance ACDM model. 
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APPENDIX A: KURTOSIS 



The calculation of the fourth moment of the aperture mass can likewise be made from the four-point correlation function: 



(M 4 )(R) = 24 



(M 3 M*)(R) 



(M 2 M* 2 )(R) = 8 
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where we define the natural components of the four-point correlation function as did Schneider & Lombardil 1200 



(A3) 

(although our 
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ordering is different) 
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) = (7( r )7( r + s)-y(r + t) 7 (r + u)e la ) 
) = (7( r )*7( r + s)-y(r + t)-y(r + u)e~ 4la ) 
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) = (7( r )7( r + s)7(r + t)* 7 (r + u)e~ 4la ) 
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The corresponding T functions are: 
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TqJ 7 (s, t' , u) — T 5 (s, t' , u) after a cyclic rotation of indices (2,3,4) 
where the q's are again the vectors from each vertex to the centroid: 

q L = (s + t' + w')/ 4 
q 2 = (t' + u - 3s)/4 
q 3 = (s + u - 3t')/4 
fl 4 = (a + t' - 3u')/4 

Finally, we can obtain the moments of M ap and M x from: 

(M£ P )(R) = ±-M{{M 4 )(R)+4{M 3 M*)(R) + 3{M 2 M* 2 )(R)) 
(M'i p M x )(R) = ]-Z({M 4 }(R) + 2{M 3 M*)(R)) 

O 

{M! P M 2 X )(R) = ijf (-(M 4 )(i?) + <M 2 M* 2 )(i?)) 
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